Rerepresenting and Restructuring Domain Theories: A Constructive Induction Approach
Theory revision integrates inductive learning and background knowledge by
combining training examples with a coarse domain theory to produce a more
accurate theory. There are two challenges that theory revision and other
theory-guided systems face. First, a representation language appropriate for
the initial theory may be inappropriate for an improved theory. While the
original representation may concisely express the initial theory, a more
accurate theory forced to use that same representation may be bulky,
cumbersome, and difficult to reach. Second, a theory structure suitable for a
coarse domain theory may be insufficient for a fine-tuned theory. Systems that
produce only small, local changes to a theory have limited value for
accomplishing complex structural alterations that may be required.
Consequently, advanced theory-guided learning systems require flexible
representation and flexible structure. An analysis of various theory revision
systems and theory-guided learning systems reveals specific strengths and
weaknesses in terms of these two desired properties. Designed to capture the
underlying qualities of each system, a new system uses theory-guided
constructive induction. Experiments in three domains show improvement over
previous theory-guided systems. This leads to a study of the behavior,
limitations, and potential of theory-guided constructive induction.
Comment: See http://www.jair.org/ for an online appendix and other files
accompanying this article
Shearlets and Optimally Sparse Approximations
Multivariate functions are typically governed by anisotropic features such as
edges in images or shock fronts in solutions of transport-dominated equations.
One major goal both for the purpose of compression as well as for an efficient
analysis is the provision of optimally sparse approximations of such functions.
Recently, cartoon-like images were introduced in 2D and 3D as a suitable model
class, and approximation properties were measured by considering the decay rate
of the error of the best $N$-term approximation. Shearlet systems are to
date the only representation systems that provide optimally sparse
approximations of this model class in 2D as well as 3D. Moreover, in contrast
to all other directional representation systems, a theory has been derived for
compactly supported shearlet frames, which also satisfy this
optimality benchmark. This chapter shall serve as an introduction to and a
survey about sparse approximations of cartoon-like images by band-limited and
also compactly supported shearlet frames as well as a reference for the
state-of-the-art of this research field.
Comment: in "Shearlets: Multiscale Analysis for Multivariate Data",
Birkhäuser-Springer
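The notion of best $N$-term approximation mentioned in this abstract can be illustrated with a minimal sketch. It assumes an orthonormal expansion, so that the approximation error equals the l2 norm of the discarded coefficients; shearlet systems are frames rather than orthonormal bases, so this is a simplification and not the chapter's construction. The function names and the synthetic coefficient sequence are hypothetical.

```python
def best_n_term(coeffs, n):
    """Keep the n largest-magnitude coefficients and zero out the rest."""
    if n <= 0:
        return [0.0] * len(coeffs)
    # Indices ordered by decreasing coefficient magnitude.
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:n])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


def l2_error(coeffs, approx):
    """l2 norm of the discarded part (equals the function-space error
    only under the orthonormal-basis assumption stated above)."""
    return sum((c - a) ** 2 for c, a in zip(coeffs, approx)) ** 0.5


# Synthetic coefficients with polynomial decay (an assumption for illustration).
coeffs = [1.0 / (k + 1) ** 2 for k in range(64)]
term_counts = (1, 2, 4, 8, 16, 32)
errors = [l2_error(coeffs, best_n_term(coeffs, n)) for n in term_counts]

for n, err in zip(term_counts, errors):
    print(f"N = {n:2d}  error = {err:.6f}")
```

The rate at which the printed error decays as N grows is the quantity such sparsity results are concerned with.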
Probabilistic Reconstruction in Compressed Sensing: Algorithms, Phase Diagrams, and Threshold Achieving Matrices
Compressed sensing is a signal processing method that acquires data directly
in a compressed form. This allows one to make fewer measurements than were
considered necessary to record a signal, enabling faster or more precise
measurement protocols in a wide range of applications. Using an
interdisciplinary approach, we have recently proposed in [arXiv:1109.4424] a
strategy that allows compressed sensing to be performed at acquisition rates
approaching the theoretical optimal limits. In this paper, we give a more
thorough presentation of our approach, and introduce many new results. We
present the probabilistic approach to reconstruction and discuss its optimality
and robustness. We detail the derivation of the message passing algorithm for
reconstruction and expectation maximization learning of signal-model
parameters. We further develop the asymptotic analysis of the corresponding
phase diagrams with and without measurement noise, for different distributions
of signals, and discuss the best possible reconstruction performances
regardless of the algorithm. We also present new efficient seeding matrices,
test them on synthetic data and analyze their performance asymptotically.
Comment: 42 pages, 37 figures, 3 appendixes
On the performance of algorithms for the minimization of $\ell_1$-penalized functionals
The problem of assessing the performance of algorithms used for the
minimization of an $\ell_1$-penalized least-squares functional, for a range of
penalty parameters, is investigated. A criterion that uses the idea of
`approximation isochrones' is introduced. Five different iterative minimization
algorithms are tested and compared, as well as two warm-start strategies. Both
well-conditioned and ill-conditioned problems are used in the comparison, and
the contrast between these two categories is highlighted.
Comment: 18 pages, 10 figures; v3: expanded version with an additional
synthetic test problem
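As a hedged illustration of the problem class in this abstract, the sketch below minimizes F(x) = 0.5 * ||Ax - y||^2 + lam * ||x||_1 with iterative soft-thresholding (ISTA), a standard method for such functionals. The abstract does not name the five algorithms it compares, so this is not presented as one of them; the scaling of the functional, the step-size rule, the example data, and all function names are assumptions. The optional `x0` argument shows one simple way to warm-start across a range of penalty parameters.

```python
def soft_threshold(v, t):
    """Scalar soft-thresholding, the proximal map of t * |v|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def objective(A, y, x, lam):
    """F(x) = 0.5 * ||A x - y||^2 + lam * ||x||_1 (assumed scaling)."""
    res = [p - q for p, q in zip(matvec(A, x), y)]
    return 0.5 * sum(r * r for r in res) + lam * sum(abs(v) for v in x)


def ista(A, y, lam, iters=500, x0=None):
    """Iterative soft-thresholding with a fixed step size."""
    x = list(x0) if x0 is not None else [0.0] * len(A[0])
    # Squared Frobenius norm: an upper bound on the Lipschitz constant
    # of the gradient, so 1/L is a safe (if conservative) step size.
    L = sum(a * a for row in A for a in row)
    step = 1.0 / L
    for _ in range(iters):
        res = [p - q for p, q in zip(matvec(A, x), y)]
        grad = rmatvec(A, res)
        x = [soft_threshold(xi - step * g, step * lam) for xi, g in zip(x, grad)]
    return x


# Small made-up problem, solved for decreasing penalties with warm starts.
A = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.5], [0.5, 0.0, 1.0], [1.0, 1.0, 1.0]]
y = [1.0, 0.0, 0.5, 1.5]
x = None
path = {}
for lam in (1.0, 0.3, 0.1):
    x = ista(A, y, lam, iters=300, x0=x)  # reuse the previous minimizer
    path[lam] = x
    print(lam, [round(v, 4) for v in x], round(objective(A, y, x, lam), 6))
```

Reusing the previous solution as the starting point is only one possible warm-start strategy; the two strategies studied in the paper are not specified in the abstract.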
Fluorescence kinetics of flavin adenine dinucleotide in different microenvironments
Fluorescence kinetics of flavin adenine dinucleotide was measured in a wide time and spectral range in different media, affecting its intra- and extramolecular interactions, and analyzed by a new method based on compressed sensing.
Efficient Resolution of Anisotropic Structures
We highlight some recent developments concerning the sparse
representation of possibly high-dimensional functions exhibiting strong
anisotropic features and low regularity in isotropic Sobolev or Besov scales.
Specifically, we focus on the solution of transport equations which exhibit
propagation of singularities where, additionally, high-dimensionality enters
when the convection field, and hence the solutions, depend on parameters
varying over some compact set. Important constituents of our approach are
directionally adaptive discretization concepts motivated by compactly supported
shearlet systems, and well-conditioned stable variational formulations that
support trial spaces with anisotropic refinements with arbitrary
directionalities. We prove that they provide tight error-residual relations
which are used to contrive rigorously founded adaptive refinement schemes which
converge in $L_2$. Moreover, in the context of parameter dependent problems we
discuss two approaches serving different purposes and working under different
regularity assumptions. For frequent query problems, making essential use of
the novel well-conditioned variational formulations, a new Reduced Basis Method
is outlined which exhibits a certain rate-optimal performance for indefinite,
unsymmetric or singularly perturbed problems. For the radiative transfer
problem with scattering a sparse tensor method is presented which mitigates or
even overcomes the curse of dimensionality under suitable (so far still
isotropic) regularity assumptions. Numerical examples for both methods
illustrate the theoretical findings.
Restricted Isometries for Partial Random Circulant Matrices
In the theory of compressed sensing, restricted isometry analysis has become
a standard tool for studying how efficiently a measurement matrix acquires
information about sparse and compressible signals. Many recovery algorithms are
known to succeed when the restricted isometry constants of the sampling matrix
are small. Many potential applications of compressed sensing involve a
data-acquisition process that proceeds by convolution with a random pulse
followed by (nonrandom) subsampling. At present, the theoretical analysis of
this measurement technique is lacking. This paper demonstrates that the $s$th
order restricted isometry constant is small when the number $m$ of samples
satisfies $m \gtrsim (s \log n)^{3/2}$, where $n$ is the length of the pulse.
This bound improves on previous estimates, which exhibit quadratic scaling.
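The acquisition process described in this abstract, convolution with a random pulse followed by nonrandom subsampling, can be sketched as follows. This is a minimal illustration, not the paper's setup: the Rademacher (random sign) pulse, the regular subsampling grid, the 1/sqrt(m) normalization, and all names are assumptions chosen for simplicity.

```python
import random


def circular_convolve(pulse, x):
    """Circular convolution: out[k] = sum_j pulse[(k - j) mod n] * x[j]."""
    n = len(x)
    return [sum(pulse[(k - j) % n] * x[j] for j in range(n)) for k in range(n)]


def partial_circulant_measure(pulse, x, sample_idx):
    """Convolve with the pulse, then keep only the entries in sample_idx.

    The 1/sqrt(m) scaling is an assumed normalization under which a random
    sign pulse preserves the squared norm of x in expectation.
    """
    full = circular_convolve(pulse, x)
    m = len(sample_idx)
    return [full[k] / m ** 0.5 for k in sample_idx]


rng = random.Random(0)                 # fixed seed for reproducibility
n, s = 32, 3                           # pulse length and sparsity (made up)
pulse = [rng.choice((-1.0, 1.0)) for _ in range(n)]

# An s-sparse test signal with randomly placed nonzero entries.
x = [0.0] * n
for j in rng.sample(range(n), s):
    x[j] = rng.uniform(-1.0, 1.0)

sample_idx = list(range(0, n, 4))      # deterministic subsampling, m = 8
y = partial_circulant_measure(pulse, x, sample_idx)
print(len(y), [round(v, 3) for v in y])
```

Recovering `x` from `y` would require a sparse recovery algorithm, which this sketch deliberately leaves out.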
Guaranteed clustering and biclustering via semidefinite programming
Identifying clusters of similar objects in data plays a significant role in a
wide range of applications. As a model problem for clustering, we consider the
densest k-disjoint-clique problem, whose goal is to identify the collection of
k disjoint cliques of a given weighted complete graph maximizing the sum of the
densities of the complete subgraphs induced by these cliques. In this paper, we
establish conditions ensuring exact recovery of the densest k cliques of a
given graph from the optimal solution of a particular semidefinite program. In
particular, the semidefinite relaxation is exact for input graphs corresponding
to data consisting of k large, distinct clusters and a smaller number of
outliers. This approach also yields a semidefinite relaxation for the
biclustering problem with similar recovery guarantees. Given a set of objects
and a set of features exhibited by these objects, biclustering seeks to
simultaneously group the objects and features according to their expression
levels. This problem may be posed as partitioning the nodes of a weighted
bipartite complete graph such that the sum of the densities of the resulting
bipartite complete subgraphs is maximized. As in our analysis of the densest
k-disjoint-clique problem, we show that the correct partition of the objects
and features can be recovered from the optimal solution of a semidefinite
program in the case that the given data consists of several disjoint sets of
objects exhibiting similar features. Empirical evidence from numerical
experiments supporting these theoretical guarantees is also provided.
Learning Partially Shared Dictionaries for Domain Adaptation
Real-world applicability of many computer vision solutions is constrained by the mismatch between the training and test domains. This mismatch may arise from factors such as changes in pose, lighting conditions, quality of imaging devices, and intra-class variations inherent in object categories. In this work, we present a dictionary-learning-based approach to tackle the problem of domain mismatch. In our approach, we jointly learn dictionaries for the source and the target domains. The dictionaries are partially shared, i.e., some elements are common to both dictionaries. These shared elements can represent the information that is common to both domains. The dictionaries also have some elements that represent the domain-specific information. Using these dictionaries, we separate the domain-specific information from the information that is common across the domains. We use the latter for training cross-domain classifiers, i.e., we build classifiers that work well on a new target domain while using labeled examples only in the source domain. We conduct cross-domain object recognition experiments on popular benchmark datasets and show improvement in results over the existing state-of-the-art domain adaptation approaches.
The merit of high-frequency data in portfolio allocation
This paper addresses the open debate about the usefulness of high-frequency (HF) data in large-scale portfolio allocation. Daily covariances are estimated based on HF data of the S&P 500 universe employing a blocked realized kernel estimator. We propose forecasting covariance matrices using a multi-scale spectral decomposition where volatilities, correlation eigenvalues and eigenvectors evolve on different frequencies. In an extensive out-of-sample forecasting study, we show that the proposed approach yields less risky and more diversified portfolio allocations than prevailing methods employing daily data. These performance gains hold over longer horizons than previous studies have shown.